Cheshire II at INEX ’03: Component and Algorithm Fusion for XML Retrieval

Author

  • Ray R. Larson
Abstract

This paper describes the retrieval approach that UC Berkeley used in the 2003 INEX evaluation. As in last year’s INEX, our primary approach combines probabilistic methods, using a logistic regression algorithm to estimate document (article) relevance and/or element relevance, with Boolean constraints. This year we also used data fusion techniques to combine results from multiple probabilistic retrieval algorithms and multiple search elements for any given query. All of our runs were fully automatic, with no manual editing or interactive submission of queries.
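The abstract does not say which fusion rule was used, so the following is only a minimal sketch of one common score-fusion scheme (CombSUM over min-max normalized scores), not the paper's actual method. The function names, the run names, and all scores are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict


def min_max_normalize(scores):
    """Rescale one run's scores to [0, 1]; a constant run maps to 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def comb_sum(runs):
    """Fuse several runs (dicts of id -> score) by summing normalized scores."""
    fused = defaultdict(float)
    for run in runs:
        for doc, s in min_max_normalize(run).items():
            fused[doc] += s
    # Sort by fused score (descending), then by id so ties are deterministic.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical scores from two searches over the same query:
# one against whole articles, one against a single element type.
article_run = {"a1": 0.9, "a2": 0.5, "a3": 0.1}
section_run = {"a2": 4.0, "a3": 2.0, "a4": 0.0}

ranking = comb_sum([article_run, section_run])
for doc, score in ranking:
    print(doc, round(score, 3))
```

Normalizing before summing keeps a run with a larger raw score range from dominating the fused ranking. An item retrieved by several runs accumulates evidence from each of them, which is the general intuition behind combining multiple algorithms and search elements.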


Similar resources

Cheshire II at INEX: Using a Hybrid Logistic Regression and Boolean Model for XML Retrieval

This paper describes the retrieval approach that Berkeley used in the INEX evaluation. The primary approach combines probabilistic methods, using a logistic regression algorithm to estimate collection relevance and element relevance, with Boolean constraints. The paper also discusses our approach to XML component retrieval and how component and document retrieval are c...


Component Ranking and Automatic Query Refinement for XML Retrieval

Queries over XML documents challenge search engines to return the most relevant XML components that satisfy the query concepts. In previous work [6] we described an algorithm for retrieving the most relevant XML components that performed relatively well in INEX'03. In this paper we show an improvement to that algorithm by introducing a document pivot that compensates for missing terms statistics ...


Cooperative XML (CoXML) Query Answering at INEX 03

The Extensible Markup Language (XML) is becoming the most popular format for information representation and data exchange. Much research has been conducted on providing flexible query facilities while aiming at efficient techniques to extract data from XML documents. However, most of it focuses only on exact matching of query conditions. In this paper, we describe a cooperative XML...


Proceedings of the Fifth Dutch-Belgian Information Retrieval

Today's content is increasingly a mixture of text, multimedia, and metadata. One way to format this mixed content is according to the adopted W3C standard for information repositories, the so-called eXtensible Markup Language (XML). The increasing use of XML in scientific data repositories, digital libraries, and on the Web has brought about an explosion in the development of XML tools, and in p...


Passage Retrieval and other XML-Retrieval Tasks

At INEX there is an underlying assumption that XML retrieval and element retrieval are one and the same. This is, in fact, not the case. The hypothesis at INEX is that XML markup is useful for information retrieval. We firmly believe this, but we no longer believe in element retrieval. In this contribution we examine in detail the evidence collected in support of element retrieval and suggest that, contra...




Publication date: 2003